Evaluation of a branch target address cache

Authors

Sreeram Duvvuru

Siamak Arya

Abstract

Branches interrupt the sequential flow of instructions and introduce pipeline bubbles. Branch penalty can be a significant component of effective CPI (cycles per instruction) in multiple-instruction-issue processors. Two key issues need to be resolved to alleviate this problem: a branch resolution scheme that decides the direction and target of a branch early in the pipeline, so that target instruction fetch can start, and mechanisms to minimize the impact of unpredictable branches. We propose a technique for caching branch target addresses in our fully predicated processor architecture that would allow the branch decision to be made in the fetch stage of the pipeline. We discuss the impact of different branch target caching policies and cache sizes on the efficiency of the branch target address cache. The impact of register-relative branches, which may have variable target addresses, is considered and a solution is suggested.
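As a rough illustration of the idea in the abstract, the sketch below models a direct-mapped branch target address cache that is probed with the fetch address and updated when a branch resolves. It is not the authors' design: the class and function names, the 64-entry default size, the 4-byte instruction assumption, and the policy of allocating only on taken branches are all assumptions made for this example.

```python
class BranchTargetAddressCache:
    """Direct-mapped cache from branch address to last observed target.

    Hypothetical model for illustration; not taken from the paper.
    """

    def __init__(self, num_entries=64):
        # num_entries is an assumed size and must be a power of two.
        assert num_entries > 0 and num_entries & (num_entries - 1) == 0
        self.num_entries = num_entries
        self.entries = [None] * num_entries  # each entry: (tag, target)

    def _index_and_tag(self, pc):
        # Assumes 4-byte instructions, so the low two address bits are dropped.
        word = pc >> 2
        return word % self.num_entries, word // self.num_entries

    def lookup(self, pc):
        """Return the cached target for pc, or None on a miss."""
        index, tag = self._index_and_tag(pc)
        entry = self.entries[index]
        if entry is not None and entry[0] == tag:
            return entry[1]
        return None

    def update(self, pc, taken, target):
        """Record the resolved outcome of the branch at pc."""
        index, tag = self._index_and_tag(pc)
        if taken:
            # Assumed policy: allocate (or overwrite) on every taken branch.
            self.entries[index] = (tag, target)
        elif self.entries[index] is not None and self.entries[index][0] == tag:
            # Assumed policy: drop the entry when its branch falls through.
            self.entries[index] = None


def simulate(trace, num_entries=64):
    """Fraction of branches whose next fetch address was predicted correctly.

    trace is a list of (pc, taken, target) tuples. A hit predicts "taken to
    the cached target"; a miss predicts "fall through".
    """
    btac = BranchTargetAddressCache(num_entries)
    correct = 0
    for pc, taken, target in trace:
        predicted = btac.lookup(pc)
        actual = target if taken else None
        if predicted == actual:
            correct += 1
        btac.update(pc, taken, target)
    return correct / len(trace) if trace else 0.0
```

Feeding `simulate` a loop branch that is taken several times to the same target shows the cache paying off after the first miss, while a trace in which one branch address alternates between two targets (as a register-relative branch might) mispredicts every time under this simple policy. Varying `num_entries` gives a crude view of how cache size affects aliasing between branches.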

Related resources

Effectiveness of microarchitecture test program generation - Design & Test of Computers, IEEE

FSM Models: As formulated in prior work [15-17], Figure 2 shows the FSM model for each of the 512 branch history table entries. A cold start initializes all entries in the branch history table to the start state, strong not taken. Any conditional branch whose address directly maps to the same branch history table entry will cause transitions in that entry’s FSM when the branch is resolved in the ex...

Don't Use the Page Number, but a Pointer to It

Most newly announced high-performance microprocessors support 64-bit virtual addresses, and the width of physical addresses is also growing. As a result, the size of the address tags in the L1 cache is increasing. The impact on chip area is particularly dramatic when small block sizes are used. At the same time, the performance of high-performance microprocessors depends more and more on the ...

Branch Prediction Strategies Using Instruction Cache

Pipelining is the major organizational technique that computers use to achieve high performance. Ideally, a pipeline uniprocessor can run at a rate that is limited by its slowest stage. Branches in the instruction stream disrupt the pipeline, by stalling and/or flushing of the pipeline, and reduce the processor performance well below ideal. Since branch instructions constitute a significant percen...

Reducing Branch Delay to Zero in Pipelined Processors

A mechanism to reduce the cost of branches in pipelined processors is described and evaluated. It is based on the use of multiple prefetch, early computation of the target address, delayed branch, and parallel execution of branches. The implementation of this mechanism using a Branch Target Instruction Memory is described. An analytical model of the performance of this implementation is present...

Omitting Cache Look-Up for High-Performance, Low-Power Microprocessors

In this paper, we propose a novel architecture for low-power direct-mapped instruction caches, called the “history-based tag-comparison (HBTC) cache”. The cache attempts to reuse tag-comparison results to avoid unnecessary tag checks. Execution footprints are recorded into an extended BTB (Branch Target Buffer). In our evaluation, it is observed that the energy for tag comparison can be reduced ...

Publication date: 1995
